Overview

Dataset statistics

Number of variables: 20
Number of observations: 55734
Missing cells: 203028
Missing cells (%): 18.2%
Duplicate rows: 585
Duplicate rows (%): 1.0%
Total size in memory: 8.5 MiB
Average record size in memory: 160.0 B
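
The percentages above follow from the raw counts. The sketch below re-derives them, assuming that the six columns flagged in the Missing alerts are the only columns with missing values (their counts do add up to the reported total) and that the missing-cell percentage is taken over variables × observations.

```python
# Figures copied from the "Dataset statistics" and "Alerts" sections of this report.
n_variables = 20
n_observations = 55734
duplicate_rows = 585

# Per-column missing counts, as listed in the Missing alerts.
missing_by_column = {
    "construcion_year": 22811,
    "construction_cost": 22811,
    "Renovation_Year": 41682,
    "renovation_cost": 41682,
    "reason_water_body_in_use_name2": 30662,
    "reason_water_body_in_use_name3": 43380,
}

total_cells = n_variables * n_observations
missing_cells = sum(missing_by_column.values())
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(missing_cells, f"{missing_pct:.1f}%", f"{duplicate_pct:.1f}%")
```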

Variable types

Categorical: 11
Boolean: 2
Numeric: 7

Alerts

State Name has constant value "" [Constant]
Dataset has 585 (1.0%) duplicate rows [Duplicates]
Original_Storage_Capacity is highly overall correlated with Present_Storage_Capacity [High correlation]
Present_Storage_Capacity is highly overall correlated with Original_Storage_Capacity [High correlation]
Reason_for_Water_Body_Use is highly overall correlated with Water_Body_Status [High correlation]
Scheme_Status_Reason is highly overall correlated with Water_Body_Status and 2 other fields [High correlation]
Water_Body_Status is highly overall correlated with Reason_for_Water_Body_Use and 4 other fields [High correlation]
construcion_year is highly overall correlated with construction_cost [High correlation]
construction_cost is highly overall correlated with construcion_year [High correlation]
no_people_benefited_by_water_body is highly overall correlated with Water_Body_Status [High correlation]
reason_water_body_in_use_name2 is highly overall correlated with Scheme_Status_Reason and 1 other field [High correlation]
reason_water_body_in_use_name3 is highly overall correlated with Scheme_Status_Reason and 1 other field [High correlation]
Area_Type is highly imbalanced (50.7%) [Imbalance]
Water_Body_Type is highly imbalanced (79.8%) [Imbalance]
Scheme_Status_Reason is highly imbalanced (67.7%) [Imbalance]
Repair_Renovation_Status is highly imbalanced (93.2%) [Imbalance]
construcion_year has 22811 (40.9%) missing values [Missing]
construction_cost has 22811 (40.9%) missing values [Missing]
Renovation_Year has 41682 (74.8%) missing values [Missing]
renovation_cost has 41682 (74.8%) missing values [Missing]
reason_water_body_in_use_name2 has 30662 (55.0%) missing values [Missing]
reason_water_body_in_use_name3 has 43380 (77.8%) missing values [Missing]
construction_cost is highly skewed (γ1 = 119.5673849) [Skewed]
renovation_cost is highly skewed (γ1 = 89.24662845) [Skewed]
Original_Storage_Capacity is highly skewed (γ1 = 109.9465346) [Skewed]
Present_Storage_Capacity is highly skewed (γ1 = 124.6175812) [Skewed]
no_people_benefited_by_water_body is highly skewed (γ1 = 114.8187724) [Skewed]
construction_cost has 766 (1.4%) zeros [Zeros]
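
The imbalance percentages in the alerts are not explained in the report itself. One formula that reproduces the figures for the two-class columns is one minus the normalised Shannon entropy of the class counts. The sketch below assumes that definition; `imbalance_score` is a name chosen for illustration, not a function taken from the report.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform split, approaching 1 as one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        # Not meaningful for fewer than two classes; 0.0 is an arbitrary choice here.
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Class counts copied from the Area_Type and Repair_Renovation_Status sections.
area_type = imbalance_score([49725, 6009])      # Rural, Urban
repair_status = imbalance_score([55281, 453])   # False, True
print(f"{area_type:.1%}", f"{repair_status:.1%}")
```

Under this assumed definition the two scores round to the 50.7% and 93.2% shown in the alerts.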

Reproduction

Analysis started: 2023-12-11 11:28:40.176706
Analysis finished: 2023-12-11 11:29:14.307554
Duration: 34.13 seconds
Software version: ydata-profiling v4.6.2
Download configuration: config.json
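
As a small consistency check, the reported duration can be recomputed from the two timestamps. This is only a sketch using the standard library; it assumes both timestamps are in the same time zone.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-11 11:28:40.176706")
finished = datetime.fromisoformat("2023-12-11 11:29:14.307554")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```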

Variables

Area_Type
Categorical

IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size435.6 KiB
Rural
49725 
Urban
6009 

Length

Max length5
Median length5
Mean length5
Min length5

Characters and Unicode

Total characters: 278670
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowRural
2nd rowRural
3rd rowRural
4th rowRural
5th rowRural

Common Values

ValueCountFrequency (%)
Rural 49725
89.2%
Urban 6009
 
10.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
rural 49725
89.2%
urban 6009
 
10.8%

Most occurring characters

ValueCountFrequency (%)
r 55734
20.0%
a 55734
20.0%
R 49725
17.8%
u 49725
17.8%
l 49725
17.8%
U 6009
 
2.2%
b 6009
 
2.2%
n 6009
 
2.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 222936
80.0%
Uppercase Letter 55734
 
20.0%
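
The character and category tables for Area_Type can be rebuilt from the two value counts alone. The sketch below weights each character of a value by the number of rows holding that value and looks up its Unicode general category with the standard `unicodedata` module ("Ll" is Lowercase Letter, "Lu" is Uppercase Letter). It illustrates the idea and is not the profiler's own code.

```python
import unicodedata
from collections import Counter

# Value counts for Area_Type, copied from the report.
value_counts = {"Rural": 49725, "Urban": 6009}

char_counts = Counter()
category_counts = Counter()
for value, n_rows in value_counts.items():
    for ch in value:
        char_counts[ch] += n_rows
        # unicodedata.category returns a two-letter general category, e.g. "Lu" or "Ll".
        category_counts[unicodedata.category(ch)] += n_rows

total_chars = sum(char_counts.values())
print(total_chars, char_counts["r"], category_counts["Ll"], category_counts["Lu"])
```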

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 55734
25.0%
a 55734
25.0%
u 49725
22.3%
l 49725
22.3%
b 6009
 
2.7%
n 6009
 
2.7%
Uppercase Letter
ValueCountFrequency (%)
R 49725
89.2%
U 6009
 
10.8%

Most occurring scripts

ValueCountFrequency (%)
Latin 278670
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 55734
20.0%
a 55734
20.0%
R 49725
17.8%
u 49725
17.8%
l 49725
17.8%
U 6009
 
2.2%
b 6009
 
2.2%
n 6009
 
2.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 278670
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 55734
20.0%
a 55734
20.0%
R 49725
17.8%
u 49725
17.8%
l 49725
17.8%
U 6009
 
2.2%
b 6009
 
2.2%
n 6009
 
2.2%

State Name
Categorical

CONSTANT 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size435.6 KiB
KERALA
55734 

Length

Max length6
Median length6
Mean length6
Min length6

Characters and Unicode

Total characters: 334404
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowKERALA
2nd rowKERALA
3rd rowKERALA
4th rowKERALA
5th rowKERALA

Common Values

ValueCountFrequency (%)
KERALA 55734
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
kerala 55734
100.0%

Most occurring characters

ValueCountFrequency (%)
A 111468
33.3%
K 55734
16.7%
E 55734
16.7%
R 55734
16.7%
L 55734
16.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 334404
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 111468
33.3%
K 55734
16.7%
E 55734
16.7%
R 55734
16.7%
L 55734
16.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 334404
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 111468
33.3%
K 55734
16.7%
E 55734
16.7%
R 55734
16.7%
L 55734
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 334404
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 111468
33.3%
K 55734
16.7%
E 55734
16.7%
R 55734
16.7%
L 55734
16.7%

District Name
Categorical

Distinct14
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size435.6 KiB
Kozhikode
6192 
Palakkad
5988 
Malappuram
5983 
Kannur
5314 
Thrissur
5023 
Other values (9)
27234 

Length

Max length18
Median length14
Mean length8.6672049
Min length6

Characters and Unicode

Total characters: 483058
Distinct characters: 27
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowKollam
2nd rowKollam
3rd rowKollam
4th rowKollam
5th rowPalakkad

Common Values

ValueCountFrequency (%)
Kozhikode 6192
11.1%
Palakkad 5988
10.7%
Malappuram 5983
10.7%
Kannur 5314
9.5%
Thrissur 5023
9.0%
Ernakulam 4416
7.9%
Alappuzha 4239
7.6%
Idukki 3792
6.8%
Kottayam 3506
6.3%
Kasargod 2880
 
5.2%
Other values (4) 8401
15.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
kozhikode 6192
11.1%
palakkad 5988
10.7%
malappuram 5983
10.7%
kannur 5314
9.5%
thrissur 5023
9.0%
ernakulam 4416
7.9%
alappuzha 4239
7.6%
idukki 3792
6.8%
kottayam 3506
6.3%
kasargod 2880
 
5.2%
Other values (4) 8401
15.1%

Most occurring characters

ValueCountFrequency (%)
a 95914
19.9%
u 34065
 
7.1%
r 33937
 
7.0%
k 30168
 
6.2%
l 25330
 
5.2%
n 23742
 
4.9%
h 23666
 
4.9%
p 23093
 
4.8%
o 21122
 
4.4%
d 20795
 
4.3%
Other values (17) 151226
31.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 427324
88.5%
Uppercase Letter 55734
 
11.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 95914
22.4%
u 34065
 
8.0%
r 33937
 
7.9%
k 30168
 
7.1%
l 25330
 
5.9%
n 23742
 
5.6%
h 23666
 
5.5%
p 23093
 
5.4%
o 21122
 
4.9%
d 20795
 
4.9%
Other values (9) 95492
22.3%
Uppercase Letter
ValueCountFrequency (%)
K 20244
36.3%
T 7672
 
13.8%
P 7445
 
13.4%
M 5983
 
10.7%
E 4416
 
7.9%
A 4239
 
7.6%
I 3792
 
6.8%
W 1943
 
3.5%

Most occurring scripts

ValueCountFrequency (%)
Latin 483058
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 95914
19.9%
u 34065
 
7.1%
r 33937
 
7.0%
k 30168
 
6.2%
l 25330
 
5.2%
n 23742
 
4.9%
h 23666
 
4.9%
p 23093
 
4.8%
o 21122
 
4.4%
d 20795
 
4.3%
Other values (17) 151226
31.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 483058
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 95914
19.9%
u 34065
 
7.1%
r 33937
 
7.0%
k 30168
 
6.2%
l 25330
 
5.2%
n 23742
 
4.9%
h 23666
 
4.9%
p 23093
 
4.8%
o 21122
 
4.4%
d 20795
 
4.3%
Other values (17) 151226
31.3%

Water_Body_Type
Categorical

IMBALANCE 

Distinct6
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size435.6 KiB
Ponds
51007 
Water consv schemes/percolation tanks/check-dams
 
3349
Tank
 
848
Others
 
463
Reservoirs
 
63

Length

Max length48
Median length5
Mean length7.5825708
Min length4

Characters and Unicode

Total characters: 422607
Distinct characters: 25
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowPonds
2nd rowPonds
3rd rowPonds
4th rowPonds
5th rowPonds

Common Values

ValueCountFrequency (%)
Ponds 51007
91.5%
Water consv schemes/percolation tanks/check-dams 3349
 
6.0%
Tank 848
 
1.5%
Others 463
 
0.8%
Reservoirs 63
 
0.1%
Lakes 4
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
ponds 51007
77.5%
water 3349
 
5.1%
consv 3349
 
5.1%
schemes/percolation 3349
 
5.1%
tanks/check-dams 3349
 
5.1%
tank 848
 
1.3%
others 463
 
0.7%
reservoirs 63
 
0.1%
lakes 4
 
< 0.1%
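
The percentages in this plot table differ from those under Common Values (77.5% rather than 91.5% for ponds) because they appear to be computed over word tokens rather than rows. The sketch below assumes the tokenisation is "lowercase, then split on whitespace", which reproduces the counts and percentages listed here; the report does not state the rule explicitly.

```python
from collections import Counter

# Value counts for Water_Body_Type, copied from the Common Values table.
value_counts = {
    "Ponds": 51007,
    "Water consv schemes/percolation tanks/check-dams": 3349,
    "Tank": 848,
    "Others": 463,
    "Reservoirs": 63,
    "Lakes": 4,
}

token_counts = Counter()
for value, n_rows in value_counts.items():
    # Assumed tokenisation: lowercase, then split on whitespace only.
    for token in value.lower().split():
        token_counts[token] += n_rows

total_tokens = sum(token_counts.values())
for token, count in token_counts.most_common(3):
    print(token, count, f"{100 * count / total_tokens:.1f}%")
```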

Most occurring characters

ValueCountFrequency (%)
s 68345
16.2%
n 61902
14.6%
o 61117
14.5%
d 54356
12.9%
P 51007
12.1%
e 17338
 
4.1%
c 16745
 
4.0%
a 14248
 
3.4%
t 10510
 
2.5%
10047
 
2.4%
Other values (15) 56992
13.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 346779
82.1%
Uppercase Letter 55734
 
13.2%
Space Separator 10047
 
2.4%
Other Punctuation 6698
 
1.6%
Dash Punctuation 3349
 
0.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 68345
19.7%
n 61902
17.9%
o 61117
17.6%
d 54356
15.7%
e 17338
 
5.0%
c 16745
 
4.8%
a 14248
 
4.1%
t 10510
 
3.0%
k 7550
 
2.2%
r 7287
 
2.1%
Other values (6) 27381
7.9%
Uppercase Letter
ValueCountFrequency (%)
P 51007
91.5%
W 3349
 
6.0%
T 848
 
1.5%
O 463
 
0.8%
R 63
 
0.1%
L 4
 
< 0.1%
Space Separator
ValueCountFrequency (%)
10047
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 6698
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3349
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 402513
95.2%
Common 20094
 
4.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 68345
17.0%
n 61902
15.4%
o 61117
15.2%
d 54356
13.5%
P 51007
12.7%
e 17338
 
4.3%
c 16745
 
4.2%
a 14248
 
3.5%
t 10510
 
2.6%
k 7550
 
1.9%
Other values (12) 39395
9.8%
Common
ValueCountFrequency (%)
10047
50.0%
/ 6698
33.3%
- 3349
 
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 422607
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 68345
16.2%
n 61902
14.6%
o 61117
14.5%
d 54356
12.9%
P 51007
12.1%
e 17338
 
4.1%
c 16745
 
4.0%
a 14248
 
3.4%
t 10510
 
2.5%
10047
 
2.4%
Other values (15) 56992
13.5%

Water_Body_Status
Boolean

HIGH CORRELATION 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size54.6 KiB
True
46550 
False
9184 
ValueCountFrequency (%)
True 46550
83.5%
False 9184
 
16.5%

Reason_for_Water_Body_Use
Categorical

HIGH CORRELATION 

Distinct9
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size435.6 KiB
Irrigation
20038 
Domestic/Drinking
10192 
Not Specified
9184 
Ground water recharge
6199 
Religious
3591 
Other values (4)
6530 

Length

Max length21
Median length17
Mean length12.821617
Min length5

Characters and Unicode

Total characters: 714600
Distinct characters: 29
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowDomestic/Drinking
2nd rowDomestic/Drinking
3rd rowNot Specified
4th rowDomestic/Drinking
5th rowNot Specified

Common Values

ValueCountFrequency (%)
Irrigation 20038
36.0%
Domestic/Drinking 10192
18.3%
Not Specified 9184
16.5%
Ground water recharge 6199
 
11.1%
Religious 3591
 
6.4%
Pisciculture 2663
 
4.8%
Other 2312
 
4.1%
Recreation 1295
 
2.3%
Industrial 260
 
0.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
irrigation 20038
25.9%
domestic/drinking 10192
13.2%
not 9184
11.9%
specified 9184
11.9%
ground 6199
 
8.0%
water 6199
 
8.0%
recharge 6199
 
8.0%
religious 3591
 
4.6%
pisciculture 2663
 
3.4%
other 2312
 
3.0%
Other values (2) 1555
 
2.0%

Most occurring characters

ValueCountFrequency (%)
i 103083
14.4%
r 81594
11.4%
e 58313
 
8.2%
t 52143
 
7.3%
o 50499
 
7.1%
n 48176
 
6.7%
g 40020
 
5.6%
a 33991
 
4.8%
c 32196
 
4.5%
21582
 
3.0%
Other values (19) 193003
27.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 607716
85.0%
Uppercase Letter 75110
 
10.5%
Space Separator 21582
 
3.0%
Other Punctuation 10192
 
1.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 103083
17.0%
r 81594
13.4%
e 58313
9.6%
t 52143
8.6%
o 50499
8.3%
n 48176
7.9%
g 40020
 
6.6%
a 33991
 
5.6%
c 32196
 
5.3%
s 16706
 
2.7%
Other values (9) 90995
15.0%
Uppercase Letter
ValueCountFrequency (%)
D 20384
27.1%
I 20298
27.0%
N 9184
12.2%
S 9184
12.2%
G 6199
 
8.3%
R 4886
 
6.5%
P 2663
 
3.5%
O 2312
 
3.1%
Space Separator
ValueCountFrequency (%)
21582
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 10192
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 682826
95.6%
Common 31774
 
4.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 103083
15.1%
r 81594
11.9%
e 58313
 
8.5%
t 52143
 
7.6%
o 50499
 
7.4%
n 48176
 
7.1%
g 40020
 
5.9%
a 33991
 
5.0%
c 32196
 
4.7%
D 20384
 
3.0%
Other values (17) 162427
23.8%
Common
ValueCountFrequency (%)
21582
67.9%
/ 10192
32.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 714600
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 103083
14.4%
r 81594
11.4%
e 58313
 
8.2%
t 52143
 
7.3%
o 50499
 
7.1%
n 48176
 
6.7%
g 40020
 
5.6%
a 33991
 
4.8%
c 32196
 
4.5%
21582
 
3.0%
Other values (19) 193003
27.0%

Scheme_Status_Reason
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct8
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size435.6 KiB
No reported problems
46550 
Others
 
4577
Siltation
 
2126
Destroyed beyond repair
 
1326
Dried-up
 
642
Other values (3)
 
513

Length

Max length27
Median length20
Mean length18.281175
Min length6

Characters and Unicode

Total characters: 1018883
Distinct characters: 25
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowNo reported problems
2nd rowNo reported problems
3rd rowSiltation
4th rowNo reported problems
5th rowOthers

Common Values

ValueCountFrequency (%)
No reported problems 46550
83.5%
Others 4577
 
8.2%
Siltation 2126
 
3.8%
Destroyed beyond repair 1326
 
2.4%
Dried-up 642
 
1.2%
Salinity 287
 
0.5%
Construction 183
 
0.3%
Due to industrial effluents 43
 
0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
no 46550
30.7%
reported 46550
30.7%
problems 46550
30.7%
others 4577
 
3.0%
siltation 2126
 
1.4%
destroyed 1326
 
0.9%
beyond 1326
 
0.9%
repair 1326
 
0.9%
dried-up 642
 
0.4%
salinity 287
 
0.2%
Other values (5) 355
 
0.2%

Most occurring characters

ValueCountFrequency (%)
e 150302
14.8%
r 149073
14.6%
o 144837
14.2%
95881
9.4%
p 95068
9.3%
t 57487
 
5.6%
s 52722
 
5.2%
d 49887
 
4.9%
l 49049
 
4.8%
b 47876
 
4.7%
Other values (15) 126701
12.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 866626
85.1%
Space Separator 95881
 
9.4%
Uppercase Letter 55734
 
5.5%
Dash Punctuation 642
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 150302
17.3%
r 149073
17.2%
o 144837
16.7%
p 95068
11.0%
t 57487
 
6.6%
s 52722
 
6.1%
d 49887
 
5.8%
l 49049
 
5.7%
b 47876
 
5.5%
m 46550
 
5.4%
Other values (8) 23775
 
2.7%
Uppercase Letter
ValueCountFrequency (%)
N 46550
83.5%
O 4577
 
8.2%
S 2413
 
4.3%
D 2011
 
3.6%
C 183
 
0.3%
Space Separator
ValueCountFrequency (%)
95881
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 642
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 922360
90.5%
Common 96523
 
9.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 150302
16.3%
r 149073
16.2%
o 144837
15.7%
p 95068
10.3%
t 57487
 
6.2%
s 52722
 
5.7%
d 49887
 
5.4%
l 49049
 
5.3%
b 47876
 
5.2%
N 46550
 
5.0%
Other values (13) 79509
8.6%
Common
ValueCountFrequency (%)
95881
99.3%
- 642
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1018883
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 150302
14.8%
r 149073
14.6%
o 144837
14.2%
95881
9.4%
p 95068
9.3%
t 57487
 
5.6%
s 52722
 
5.2%
d 49887
 
4.9%
l 49049
 
4.8%
b 47876
 
4.7%
Other values (15) 126701
12.4%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size435.6 KiB
Man-made
32923 
Natural
22811 

Length

Max length8
Median length8
Mean length7.5907166
Min length7

Characters and Unicode

Total characters: 423061
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowNatural
2nd rowNatural
3rd rowNatural
4th rowNatural
5th rowNatural

Common Values

ValueCountFrequency (%)
Man-made 32923
59.1%
Natural 22811
40.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
man-made 32923
59.1%
natural 22811
40.9%

Most occurring characters

ValueCountFrequency (%)
a 111468
26.3%
M 32923
 
7.8%
n 32923
 
7.8%
- 32923
 
7.8%
m 32923
 
7.8%
d 32923
 
7.8%
e 32923
 
7.8%
N 22811
 
5.4%
t 22811
 
5.4%
u 22811
 
5.4%
Other values (2) 45622
10.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 334404
79.0%
Uppercase Letter 55734
 
13.2%
Dash Punctuation 32923
 
7.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 111468
33.3%
n 32923
 
9.8%
m 32923
 
9.8%
d 32923
 
9.8%
e 32923
 
9.8%
t 22811
 
6.8%
u 22811
 
6.8%
r 22811
 
6.8%
l 22811
 
6.8%
Uppercase Letter
ValueCountFrequency (%)
M 32923
59.1%
N 22811
40.9%
Dash Punctuation
ValueCountFrequency (%)
- 32923
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 390138
92.2%
Common 32923
 
7.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 111468
28.6%
M 32923
 
8.4%
n 32923
 
8.4%
m 32923
 
8.4%
d 32923
 
8.4%
e 32923
 
8.4%
N 22811
 
5.8%
t 22811
 
5.8%
u 22811
 
5.8%
r 22811
 
5.8%
Common
ValueCountFrequency (%)
- 32923
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 423061
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 111468
26.3%
M 32923
 
7.8%
n 32923
 
7.8%
- 32923
 
7.8%
m 32923
 
7.8%
d 32923
 
7.8%
e 32923
 
7.8%
N 22811
 
5.4%
t 22811
 
5.4%
u 22811
 
5.4%
Other values (2) 45622
10.8%

construcion_year
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 177
Distinct (%): 0.5%
Missing: 22811
Missing (%): 40.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 1987.9537
Minimum: 1519
Maximum: 2020
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 435.6 KiB

Quantile statistics

Minimum: 1519
5-th percentile: 1940
Q1: 1980
Median: 1995
Q3: 2005
95-th percentile: 2016
Maximum: 2020
Range: 501
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 29.667765
Coefficient of variation (CV): 0.01492377
Kurtosis: 41.113718
Mean: 1987.9537
Median Absolute Deviation (MAD): 14
Skewness: -4.254521
Sum: 65449401
Variance: 880.17625
Monotonicity: Not monotonic
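
Several of the statistics above are simple functions of the others. The sketch below recomputes the range, IQR, mean, coefficient of variation and variance for construcion_year from the reported figures, assuming the mean is taken over non-missing rows only. Small differences in the last digits are expected because the inputs are already rounded.

```python
# Figures copied from the construcion_year statistics in this report.
n_observations = 55734
n_missing = 22811
minimum, q1, q3, maximum = 1519, 1980, 2005, 2020
total = 65449401          # "Sum"
std_dev = 29.667765       # "Standard deviation"

n_valid = n_observations - n_missing   # rows with a construction year
value_range = maximum - minimum
iqr = q3 - q1
mean = total / n_valid
cv = std_dev / mean                    # coefficient of variation
variance = std_dev ** 2

print(value_range, iqr, round(mean, 4), round(cv, 8), round(variance, 3))
```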
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1990 2773
 
5.0%
1980 2390
 
4.3%
2000 2384
 
4.3%
1995 1255
 
2.3%
1970 1138
 
2.0%
1985 1127
 
2.0%
2010 1021
 
1.8%
2015 878
 
1.6%
1998 851
 
1.5%
2005 800
 
1.4%
Other values (167) 18306
32.8%
(Missing) 22811
40.9%
ValueCountFrequency (%)
1519 9
< 0.1%
1520 3
 
< 0.1%
1565 1
 
< 0.1%
1569 1
 
< 0.1%
1580 1
 
< 0.1%
1600 2
 
< 0.1%
1618 1
 
< 0.1%
1619 3
 
< 0.1%
1620 1
 
< 0.1%
1650 1
 
< 0.1%
ValueCountFrequency (%)
2020 27
 
< 0.1%
2019 174
 
0.3%
2018 697
1.3%
2017 709
1.3%
2016 682
1.2%
2015 878
1.6%
2014 777
1.4%
2013 537
1.0%
2012 552
1.0%
2011 281
 
0.5%

construction_cost
Real number (ℝ)

HIGH CORRELATION  MISSING  SKEWED  ZEROS 

Distinct: 1092
Distinct (%): 3.3%
Missing: 22811
Missing (%): 40.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 1013942.6
Minimum: 0
Maximum: 7.6 × 10^9
Zeros: 766
Zeros (%): 1.4%
Negative: 0
Negative (%): 0.0%
Memory size: 435.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 300
Q1: 5000
Median: 20000
Q3: 75000
95-th percentile: 800000
Maximum: 7.6 × 10^9
Range: 7.6 × 10^9
Interquartile range (IQR): 70000

Descriptive statistics

Standard deviation: 51358566
Coefficient of variation (CV): 50.652342
Kurtosis: 16256.15
Mean: 1013942.6
Median Absolute Deviation (MAD): 19000
Skewness: 119.56738
Sum: 3.3382032 × 10^10
Variance: 2.6377023 × 10^15
Monotonicity: Not monotonic
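
A skewness above 100 typically means that a handful of extreme values dominate the third moment. The sketch below illustrates the effect on hypothetical data (the values are invented, not drawn from this dataset) using the biased Fisher-Pearson coefficient g1 = m3 / m2^1.5. The profiler may use a bias-adjusted variant, so this is not expected to match its figures exactly. The last line shows that a log transform reduces the skew, which is one common way to handle such columns.

```python
import math

def skewness(values):
    """Biased Fisher-Pearson coefficient g1 = m3 / m2**1.5 (population moments)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical costs (not taken from the dataset): many typical values plus one huge one.
typical = [5000, 10000, 20000, 50000, 75000] * 20
with_outlier = typical + [7_600_000_000]

raw_skew = skewness(with_outlier)
log_skew = skewness([math.log10(v) for v in with_outlier])
print(skewness(typical), raw_skew, log_skew)
```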
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
10000 2353
 
4.2%
5000 2031
 
3.6%
50000 2029
 
3.6%
20000 1450
 
2.6%
100000 1344
 
2.4%
1000 1245
 
2.2%
25000 1206
 
2.2%
15000 1178
 
2.1%
2000 1103
 
2.0%
30000 990
 
1.8%
Other values (1082) 17994
32.3%
(Missing) 22811
40.9%
ValueCountFrequency (%)
0 766
1.4%
1 8
 
< 0.1%
2 6
 
< 0.1%
4 1
 
< 0.1%
5 4
 
< 0.1%
9 1
 
< 0.1%
10 36
 
0.1%
12 2
 
< 0.1%
15 1
 
< 0.1%
20 30
 
0.1%
ValueCountFrequency (%)
7600000000 1
< 0.1%
4420000000 1
< 0.1%
1200000000 1
< 0.1%
1075700000 1
< 0.1%
1042000000 1
< 0.1%
1000000000 1
< 0.1%
930000000 1
< 0.1%
871200000 1
< 0.1%
800000000 1
< 0.1%
610000000 1
< 0.1%

Renovation_Year
Real number (ℝ)

MISSING 

Distinct: 68
Distinct (%): 0.5%
Missing: 41682
Missing (%): 74.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 2008.9353
Minimum: 1949
Maximum: 2020
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 435.6 KiB

Quantile statistics

Minimum: 1949
5-th percentile: 1989
Q1: 2003
Median: 2013
Q3: 2017
95-th percentile: 2018
Maximum: 2020
Range: 71
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 10.639445
Coefficient of variation (CV): 0.0052960616
Kurtosis: 4.721049
Mean: 2008.9353
Median Absolute Deviation (MAD): 5
Skewness: -1.8692308
Sum: 28229559
Variance: 113.19779
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
2018 1926
 
3.5%
2015 1397
 
2.5%
2017 1284
 
2.3%
2016 1132
 
2.0%
2010 921
 
1.7%
2000 912
 
1.6%
2014 574
 
1.0%
2005 540
 
1.0%
2012 515
 
0.9%
2019 500
 
0.9%
Other values (58) 4351
 
7.8%
(Missing) 41682
74.8%
ValueCountFrequency (%)
1949 1
 
< 0.1%
1950 32
0.1%
1951 1
 
< 0.1%
1952 5
 
< 0.1%
1954 1
 
< 0.1%
1955 1
 
< 0.1%
1956 1
 
< 0.1%
1958 3
 
< 0.1%
1959 2
 
< 0.1%
1960 38
0.1%
ValueCountFrequency (%)
2020 33
 
0.1%
2019 500
 
0.9%
2018 1926
3.5%
2017 1284
2.3%
2016 1132
2.0%
2015 1397
2.5%
2014 574
 
1.0%
2013 341
 
0.6%
2012 515
 
0.9%
2011 197
 
0.4%

renovation_cost
Real number (ℝ)

MISSING  SKEWED 

Distinct: 490
Distinct (%): 3.5%
Missing: 41682
Missing (%): 74.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 179471.2
Minimum: 1
Maximum: 5.5 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 435.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1500
Q1: 5000
Median: 10000
Q3: 50000
95-th percentile: 400000
Maximum: 5.5 × 10^8
Range: 5.5 × 10^8
Interquartile range (IQR): 45000

Descriptive statistics

Standard deviation: 5215352.3
Coefficient of variation (CV): 29.05955
Kurtosis: 8977.782
Mean: 179471.2
Median Absolute Deviation (MAD): 8000
Skewness: 89.246628
Sum: 2.5219293 × 10^9
Variance: 2.71999 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
5000 1670
 
3.0%
10000 1486
 
2.7%
50000 782
 
1.4%
20000 617
 
1.1%
100000 542
 
1.0%
3000 535
 
1.0%
15000 497
 
0.9%
2000 479
 
0.9%
8000 477
 
0.9%
6000 475
 
0.9%
Other values (480) 6492
 
11.6%
(Missing) 41682
74.8%
ValueCountFrequency (%)
1 2
 
< 0.1%
2 3
 
< 0.1%
3 1
 
< 0.1%
4 1
 
< 0.1%
10 6
< 0.1%
20 8
< 0.1%
30 1
 
< 0.1%
50 6
< 0.1%
60 1
 
< 0.1%
80 1
 
< 0.1%
ValueCountFrequency (%)
550000000 1
< 0.1%
180000000 1
< 0.1%
170000000 1
< 0.1%
69600000 1
< 0.1%
60000000 1
< 0.1%
55000000 1
< 0.1%
53500000 1
< 0.1%
30000000 1
< 0.1%
29000000 1
< 0.1%
18000000 1
< 0.1%

Repair_Renovation_Status
Boolean

IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size54.6 KiB
False
55281 
True
 
453
ValueCountFrequency (%)
False 55281
99.2%
True 453
 
0.8%

Original_Storage_Capacity
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 2212
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 172467.46
Minimum: 1
Maximum: 1.6979 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 435.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 30
Q1: 120
Median: 350
Q3: 1500
95-th percentile: 172467.46
Maximum: 1.6979 × 10^9
Range: 1.6979 × 10^9
Interquartile range (IQR): 1380

Descriptive statistics

Standard deviation: 11803863
Coefficient of variation (CV): 68.441102
Kurtosis: 13579.805
Mean: 172467.46
Median Absolute Deviation (MAD): 290
Skewness: 109.94653
Sum: 9.6123015 × 10^9
Variance: 1.3933119 × 10^14
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
172467.4606 3812
 
6.8%
120 1909
 
3.4%
200 1821
 
3.3%
60 1779
 
3.2%
300 1688
 
3.0%
100 1436
 
2.6%
240 1300
 
2.3%
80 1203
 
2.2%
40 1147
 
2.1%
600 1133
 
2.0%
Other values (2202) 38506
69.1%
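
In the table above, the most frequent value (172467.4606, in 3812 rows) coincides with the reported mean of the column (172467.46). One possible explanation, which the report itself does not confirm, is that missing entries were filled with the column mean before profiling. The sketch below uses hypothetical values (not taken from the dataset) to show that mean imputation produces exactly this pattern: the imputed value becomes the most common one and the overall mean is unchanged.

```python
from collections import Counter

def impute_with_mean(values):
    """Replace None with the mean of the observed values; return (filled, mean)."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values], mean

# Hypothetical capacities (not taken from the dataset); None marks a missing entry.
raw = [120, 200, 60, None, 300, None, 100, None]
filled, observed_mean = impute_with_mean(raw)

most_common_value, frequency = Counter(filled).most_common(1)[0]
overall_mean = sum(filled) / len(filled)
print(observed_mean, most_common_value, frequency, overall_mean)
```

If this is what happened here, the skewness, quantiles and correlations reported for this column describe the imputed data rather than the raw measurements.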
ValueCountFrequency (%)
1 12
 
< 0.1%
2 21
 
< 0.1%
3 24
 
< 0.1%
4 27
 
< 0.1%
5 19
 
< 0.1%
6 40
 
0.1%
7 28
 
0.1%
8 27
 
< 0.1%
9 54
 
0.1%
10 290
0.5%
ValueCountFrequency (%)
1697900000 1
< 0.1%
1459490000 1
< 0.1%
1089800000 1
< 0.1%
708200000 1
< 0.1%
504920000 1
< 0.1%
454140000 1
< 0.1%
446570000 1
< 0.1%
443230000 1
< 0.1%
215340000 1
< 0.1%
201000000 1
< 0.1%

Present_Storage_Capacity
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 2424
Distinct (%): 4.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 142922.56
Minimum: 1
Maximum: 1.6979 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 435.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 20
Q1: 90
Median: 242
Q3: 1000
95-th percentile: 142922.56
Maximum: 1.6979 × 10^9
Range: 1.6979 × 10^9
Interquartile range (IQR): 910

Descriptive statistics

Standard deviation: 10937592
Coefficient of variation (CV): 76.5281
Kurtosis: 16997.421
Mean: 142922.56
Median Absolute Deviation (MAD): 202
Skewness: 124.61758
Sum: 7.965646 × 10^9
Variance: 1.1963092 × 10^14
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
142922.5615 3812
 
6.8%
100 2143
 
3.8%
200 1652
 
3.0%
150 1563
 
2.8%
300 1334
 
2.4%
50 1247
 
2.2%
80 1067
 
1.9%
400 899
 
1.6%
120 883
 
1.6%
60 863
 
1.5%
Other values (2414) 40271
72.3%
ValueCountFrequency (%)
1 63
 
0.1%
2 57
 
0.1%
3 51
 
0.1%
4 54
 
0.1%
5 113
 
0.2%
6 89
 
0.2%
7 74
 
0.1%
8 124
 
0.2%
9 102
 
0.2%
10 451
0.8%
ValueCountFrequency (%)
1697900000 1
< 0.1%
1404000000 1
< 0.1%
1064050000 1
< 0.1%
476160000 1
< 0.1%
352500000 1
< 0.1%
320000000 1
< 0.1%
254000000 1
< 0.1%
222500000 1
< 0.1%
157600000 1
< 0.1%
111890000 1
< 0.1%
Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size435.6 KiB
Full
33236 
Upto 3/4
17233 
Upto 1/2
3781 
Upto 1/4
 
1045
Nil/Negligible filled up
 
439

Length

Max length24
Median length4
Mean length5.7406969
Min length4

Characters and Unicode

Total characters: 319952
Distinct characters: 20
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowFull
2nd rowFull
3rd rowFull
4th rowFull
5th rowUpto 3/4

Common Values

ValueCountFrequency (%)
Full 33236
59.6%
Upto 3/4 17233
30.9%
Upto 1/2 3781
 
6.8%
Upto 1/4 1045
 
1.9%
Nil/Negligible filled up 439
 
0.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
full 33236
42.2%
upto 22059
28.0%
3/4 17233
21.9%
1/2 3781
 
4.8%
1/4 1045
 
1.3%
nil/negligible 439
 
0.6%
filled 439
 
0.6%
up 439
 
0.6%

Most occurring characters

Value | Count | Frequency (%)
l | 68667 | 21.5%
u | 33675 | 10.5%
F | 33236 | 10.4%
(space) | 22937 | 7.2%
/ | 22498 | 7.0%
p | 22498 | 7.0%
U | 22059 | 6.9%
t | 22059 | 6.9%
o | 22059 | 6.9%
4 | 18278 | 5.7%
Other values (10) | 31986 | 10.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 174226 | 54.5%
Uppercase Letter | 56173 | 17.6%
Decimal Number | 44118 | 13.8%
Space Separator | 22937 | 7.2%
Other Punctuation | 22498 | 7.0%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
l | 68667 | 39.4%
u | 33675 | 19.3%
p | 22498 | 12.9%
t | 22059 | 12.7%
o | 22059 | 12.7%
i | 1756 | 1.0%
e | 1317 | 0.8%
g | 878 | 0.5%
b | 439 | 0.3%
f | 439 | 0.3%

Decimal Number
Value | Count | Frequency (%)
4 | 18278 | 41.4%
3 | 17233 | 39.1%
1 | 4826 | 10.9%
2 | 3781 | 8.6%

Uppercase Letter
Value | Count | Frequency (%)
F | 33236 | 59.2%
U | 22059 | 39.3%
N | 878 | 1.6%

Space Separator
Value | Count | Frequency (%)
(space) | 22937 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 22498 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 230399 | 72.0%
Common | 89553 | 28.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
l | 68667 | 29.8%
u | 33675 | 14.6%
F | 33236 | 14.4%
p | 22498 | 9.8%
U | 22059 | 9.6%
t | 22059 | 9.6%
o | 22059 | 9.6%
i | 1756 | 0.8%
e | 1317 | 0.6%
N | 878 | 0.4%
Other values (4) | 2195 | 1.0%

Common
Value | Count | Frequency (%)
(space) | 22937 | 25.6%
/ | 22498 | 25.1%
4 | 18278 | 20.4%
3 | 17233 | 19.2%
1 | 4826 | 5.4%
2 | 3781 | 4.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 319952 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
l | 68667 | 21.5%
u | 33675 | 10.5%
F | 33236 | 10.4%
(space) | 22937 | 7.2%
/ | 22498 | 7.0%
p | 22498 | 7.0%
U | 22059 | 6.9%
t | 22059 | 6.9%
o | 22059 | 6.9%
4 | 18278 | 5.7%
Other values (10) | 31986 | 10.0%

filled_up_storage_space_name
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 435.6 KiB

Filled up every year: 31424
Usually filled up: 17373
Rarely filled up: 5431
Never filled up: 1506

Length

Max length: 20
Median length: 20
Mean length: 18.539976
Min length: 15
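
The length figures can be cross-checked from the four category counts alone. The sketch below is an illustrative recomputation, not the report's own code; it derives the total character count and the mean, minimum and maximum lengths from the values listed for this variable.

```python
# Category counts as listed for this variable.
value_counts = {
    "Filled up every year": 31424,
    "Usually filled up": 17373,
    "Rarely filled up": 5431,
    "Never filled up": 1506,
}

n_rows = sum(value_counts.values())
# Each value contributes its string length once per occurrence.
total_chars = sum(len(value) * n for value, n in value_counts.items())
mean_length = total_chars / n_rows
min_length = min(len(value) for value in value_counts)
max_length = max(len(value) for value in value_counts)

print(n_rows, total_chars, round(mean_length, 6), min_length, max_length)
```

The result agrees with the reported total of 1033307 characters over 55734 rows.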

Characters and Unicode

Total characters: 1033307
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Filled up every year
2nd row: Filled up every year
3rd row: Filled up every year
4th row: Filled up every year
5th row: Usually filled up

Common Values

Value | Count | Frequency (%)
Filled up every year | 31424 | 56.4%
Usually filled up | 17373 | 31.2%
Rarely filled up | 5431 | 9.7%
Never filled up | 1506 | 2.7%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
filled | 55734 | 28.1%
up | 55734 | 28.1%
every | 31424 | 15.8%
year | 31424 | 15.8%
usually | 17373 | 8.7%
rarely | 5431 | 2.7%
never | 1506 | 0.8%

Most occurring characters

Value | Count | Frequency (%)
e | 158449 | 15.3%
l | 151645 | 14.7%
(space) | 142892 | 13.8%
y | 85652 | 8.3%
u | 73107 | 7.1%
r | 69785 | 6.8%
d | 55734 | 5.4%
p | 55734 | 5.4%
i | 55734 | 5.4%
a | 54228 | 5.2%
Other values (7) | 130347 | 12.6%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 834681 | 80.8%
Space Separator | 142892 | 13.8%
Uppercase Letter | 55734 | 5.4%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 158449 | 19.0%
l | 151645 | 18.2%
y | 85652 | 10.3%
u | 73107 | 8.8%
r | 69785 | 8.4%
d | 55734 | 6.7%
p | 55734 | 6.7%
i | 55734 | 6.7%
a | 54228 | 6.5%
v | 32930 | 3.9%
Other values (2) | 41683 | 5.0%

Uppercase Letter
Value | Count | Frequency (%)
F | 31424 | 56.4%
U | 17373 | 31.2%
R | 5431 | 9.7%
N | 1506 | 2.7%

Space Separator
Value | Count | Frequency (%)
(space) | 142892 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 890415 | 86.2%
Common | 142892 | 13.8%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 158449 | 17.8%
l | 151645 | 17.0%
y | 85652 | 9.6%
u | 73107 | 8.2%
r | 69785 | 7.8%
d | 55734 | 6.3%
p | 55734 | 6.3%
i | 55734 | 6.3%
a | 54228 | 6.1%
v | 32930 | 3.7%
Other values (6) | 97417 | 10.9%

Common
Value | Count | Frequency (%)
(space) | 142892 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1033307 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 158449 | 15.3%
l | 151645 | 14.7%
(space) | 142892 | 13.8%
y | 85652 | 8.3%
u | 73107 | 7.1%
r | 69785 | 6.8%
d | 55734 | 5.4%
p | 55734 | 5.4%
i | 55734 | 5.4%
a | 54228 | 5.2%
Other values (7) | 130347 | 12.6%

no_people_benefited_by_water_body
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 278
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 681.37848
Minimum: 1
Maximum: 5000000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 435.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 6
Median: 18
Q3: 100
95-th percentile: 681
Maximum: 5000000
Range: 4999999
Interquartile range (IQR): 94
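
The last two figures follow directly from the quantiles: the range is the maximum minus the minimum, and the IQR is Q3 minus Q1. The sketch below recomputes both. It also derives an upper outlier fence with Tukey's 1.5 × IQR rule, which is a common convention assumed here for illustration and is not something this report states it uses.

```python
# Quantiles as reported for no_people_benefited_by_water_body.
minimum, q1, median, q3, maximum = 1, 6, 18, 100, 5_000_000
p95 = 681

value_range = maximum - minimum  # reported as 4999999
iqr = q3 - q1                    # reported as 94

# Assumption: Tukey's rule, not taken from the report.
upper_fence = q3 + 1.5 * iqr

print(value_range, iqr, upper_fence)
print("95th percentile above fence:", p95 > upper_fence)
```

With a median of 18 and a maximum of 5000000, the distribution has a very long right tail, which is consistent with the SKEWED alert on this variable.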

Descriptive statistics

Standard deviation: 29400.983
Coefficient of variation (CV): 43.149268
Kurtosis: 16821.517
Mean: 681.37848
Median Absolute Deviation (MAD): 15
Skewness: 114.81877
Sum: 37975948
Variance: 8.6441779 × 10^8
Monotonicity: Not monotonic
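
Several of these statistics are related by simple identities: CV is the standard deviation divided by the mean, the variance is the standard deviation squared, and the sum is the mean times the number of observations. A minimal check using the rounded figures printed above (so small rounding differences are expected):

```python
import math

# Figures as printed in the report (already rounded).
mean = 681.37848
std = 29400.983
n_rows = 55734

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance from the standard deviation
total = mean * n_rows  # sum of all values

print(round(cv, 4), f"{variance:.7e}", round(total))
print(math.isclose(variance, 8.6441779e8, rel_tol=1e-6))
```

A CV of about 43 means the standard deviation is roughly 43 times the mean, so a handful of very large values dominate the spread.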
[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value | Count | Frequency (%)
681 | 9184 | 16.5%
10 | 8220 | 14.7%
1 | 5020 | 9.0%
5 | 4436 | 8.0%
50 | 2954 | 5.3%
20 | 2537 | 4.6%
100 | 2147 | 3.9%
4 | 2057 | 3.7%
15 | 1772 | 3.2%
30 | 1672 | 3.0%
Other values (268) | 15735 | 28.2%

Minimum values

Value | Count | Frequency (%)
1 | 5020 | 9.0%
2 | 1018 | 1.8%
3 | 814 | 1.5%
4 | 2057 | 3.7%
5 | 4436 | 8.0%
6 | 1377 | 2.5%
7 | 677 | 1.2%
8 | 959 | 1.7%
9 | 233 | 0.4%
10 | 8220 | 14.7%

Maximum values

Value | Count | Frequency (%)
5000000 | 1 | < 0.1%
2500000 | 1 | < 0.1%
2000000 | 2 | < 0.1%
1100000 | 1 | < 0.1%
1000000 | 2 | < 0.1%
750000 | 1 | < 0.1%
675000 | 1 | < 0.1%
667800 | 1 | < 0.1%
500000 | 8 | < 0.1%
450000 | 2 | < 0.1%

reason_water_body_in_use_name2
Categorical

HIGH CORRELATION  MISSING 

Distinct: 8
Distinct (%): < 0.1%
Missing: 30662
Missing (%): 55.0%
Memory size: 435.6 KiB

Ground water recharge: 12843
Domestic/Drinking: 5984
Other: 1732
Pisciculture: 1485
Recreation: 1226
Other values (3): 1802

Length

Max length: 21
Median length: 21
Mean length: 17.048141
Min length: 5

Characters and Unicode

Total characters: 427431
Distinct characters: 25
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Ground water recharge
2nd row: Ground water recharge
3rd row: Ground water recharge
4th row: Ground water recharge
5th row: Ground water recharge

Common Values

Value | Count | Frequency (%)
Ground water recharge | 12843 | 23.0%
Domestic/Drinking | 5984 | 10.7%
Other | 1732 | 3.1%
Pisciculture | 1485 | 2.7%
Recreation | 1226 | 2.2%
Irrigation | 873 | 1.6%
Religious | 760 | 1.4%
Industrial | 169 | 0.3%
(Missing) | 30662 | 55.0%
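
Note that the percentages in this table are shares of all 55734 rows, with missing values counted as their own row, not shares of the 25072 non-missing values. A short sketch that reproduces the column from the counts (the variable names are illustrative):

```python
# Counts copied from the Common Values table; None stands for "(Missing)".
value_counts = {
    "Ground water recharge": 12843,
    "Domestic/Drinking": 5984,
    "Other": 1732,
    "Pisciculture": 1485,
    "Recreation": 1226,
    "Irrigation": 873,
    "Religious": 760,
    "Industrial": 169,
    None: 30662,
}

n_rows = sum(value_counts.values())
n_present = n_rows - value_counts[None]

# Share of all rows, which is what the table reports.
share_of_all = {v: round(100 * n / n_rows, 1) for v, n in value_counts.items()}
# Share among non-missing rows only, for comparison.
share_of_present = {
    v: round(100 * n / n_present, 1)
    for v, n in value_counts.items()
    if v is not None
}

print(share_of_all["Ground water recharge"])
print(share_of_present["Ground water recharge"])
```

Among the rows that do have a value, "Ground water recharge" accounts for roughly half, even though it is listed as 23.0% of all rows.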

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
ground | 12843 | 25.3%
water | 12843 | 25.3%
recharge | 12843 | 25.3%
domestic/drinking | 5984 | 11.8%
other | 1732 | 3.4%
pisciculture | 1485 | 2.9%
recreation | 1226 | 2.4%
irrigation | 873 | 1.7%
religious | 760 | 1.5%
industrial | 169 | 0.3%

Most occurring characters

Value | Count | Frequency (%)
r | 63714 | 14.9%
e | 50942 | 11.9%
a | 27954 | 6.5%
n | 27079 | 6.3%
(space) | 25686 | 6.0%
i | 25583 | 6.0%
t | 24312 | 5.7%
c | 23023 | 5.4%
o | 21686 | 5.1%
g | 20460 | 4.8%
Other values (15) | 116992 | 27.4%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 364705 | 85.3%
Uppercase Letter | 31056 | 7.3%
Space Separator | 25686 | 6.0%
Other Punctuation | 5984 | 1.4%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
r | 63714 | 17.5%
e | 50942 | 14.0%
a | 27954 | 7.7%
n | 27079 | 7.4%
i | 25583 | 7.0%
t | 24312 | 6.7%
c | 23023 | 6.3%
o | 21686 | 5.9%
g | 20460 | 5.6%
u | 16742 | 4.6%
Other values (7) | 63210 | 17.3%

Uppercase Letter
Value | Count | Frequency (%)
G | 12843 | 41.4%
D | 11968 | 38.5%
R | 1986 | 6.4%
O | 1732 | 5.6%
P | 1485 | 4.8%
I | 1042 | 3.4%

Space Separator
Value | Count | Frequency (%)
(space) | 25686 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 5984 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 395761 | 92.6%
Common | 31670 | 7.4%

Most frequent character per script

Latin
Value | Count | Frequency (%)
r | 63714 | 16.1%
e | 50942 | 12.9%
a | 27954 | 7.1%
n | 27079 | 6.8%
i | 25583 | 6.5%
t | 24312 | 6.1%
c | 23023 | 5.8%
o | 21686 | 5.5%
g | 20460 | 5.2%
u | 16742 | 4.2%
Other values (13) | 94266 | 23.8%

Common
Value | Count | Frequency (%)
(space) | 25686 | 81.1%
/ | 5984 | 18.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 427431 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
r | 63714 | 14.9%
e | 50942 | 11.9%
a | 27954 | 6.5%
n | 27079 | 6.3%
(space) | 25686 | 6.0%
i | 25583 | 6.0%
t | 24312 | 5.7%
c | 23023 | 5.4%
o | 21686 | 5.1%
g | 20460 | 4.8%
Other values (15) | 116992 | 27.4%

reason_water_body_in_use_name3
Categorical

HIGH CORRELATION  MISSING 

Distinct: 8
Distinct (%): 0.1%
Missing: 43380
Missing (%): 77.8%
Memory size: 435.6 KiB

Ground water recharge: 4797
Other: 4727
Domestic/Drinking: 1054
Irrigation: 892
Recreation: 356
Other values (3): 528

Length

Max length: 21
Median length: 17
Mean length: 12.994334
Min length: 5

Characters and Unicode

Total characters: 160532
Distinct characters: 25
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Pisciculture
2nd row: Pisciculture
3rd row: Pisciculture
4th row: Pisciculture
5th row: Pisciculture

Common Values

Value | Count | Frequency (%)
Ground water recharge | 4797 | 8.6%
Other | 4727 | 8.5%
Domestic/Drinking | 1054 | 1.9%
Irrigation | 892 | 1.6%
Recreation | 356 | 0.6%
Pisciculture | 318 | 0.6%
Religious | 154 | 0.3%
Industrial | 56 | 0.1%
(Missing) | 43380 | 77.8%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
ground | 4797 | 21.9%
water | 4797 | 21.9%
recharge | 4797 | 21.9%
other | 4727 | 21.5%
domestic/drinking | 1054 | 4.8%
irrigation | 892 | 4.1%
recreation | 356 | 1.6%
pisciculture | 318 | 1.4%
religious | 154 | 0.7%
industrial | 56 | 0.3%

Most occurring characters

Value | Count | Frequency (%)
r | 27483 | 17.1%
e | 21356 | 13.3%
t | 12200 | 7.6%
a | 10898 | 6.8%
(space) | 9594 | 6.0%
h | 9524 | 5.9%
n | 8209 | 5.1%
o | 7253 | 4.5%
g | 6897 | 4.3%
c | 6843 | 4.3%
Other values (15) | 40275 | 25.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 136476 | 85.0%
Uppercase Letter | 13408 | 8.4%
Space Separator | 9594 | 6.0%
Other Punctuation | 1054 | 0.7%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
r | 27483 | 20.1%
e | 21356 | 15.6%
t | 12200 | 8.9%
a | 10898 | 8.0%
h | 9524 | 7.0%
n | 8209 | 6.0%
o | 7253 | 5.3%
g | 6897 | 5.1%
c | 6843 | 5.0%
i | 6302 | 4.6%
Other values (7) | 19511 | 14.3%

Uppercase Letter
Value | Count | Frequency (%)
G | 4797 | 35.8%
O | 4727 | 35.3%
D | 2108 | 15.7%
I | 948 | 7.1%
R | 510 | 3.8%
P | 318 | 2.4%

Space Separator
Value | Count | Frequency (%)
(space) | 9594 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 1054 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 149884 | 93.4%
Common | 10648 | 6.6%

Most frequent character per script

Latin
Value | Count | Frequency (%)
r | 27483 | 18.3%
e | 21356 | 14.2%
t | 12200 | 8.1%
a | 10898 | 7.3%
h | 9524 | 6.4%
n | 8209 | 5.5%
o | 7253 | 4.8%
g | 6897 | 4.6%
c | 6843 | 4.6%
i | 6302 | 4.2%
Other values (13) | 32919 | 22.0%

Common
Value | Count | Frequency (%)
(space) | 9594 | 90.1%
/ | 1054 | 9.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 160532 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
r | 27483 | 17.1%
e | 21356 | 13.3%
t | 12200 | 7.6%
a | 10898 | 6.8%
(space) | 9594 | 6.0%
h | 9524 | 5.9%
n | 8209 | 5.1%
o | 7253 | 4.5%
g | 6897 | 4.3%
c | 6843 | 4.3%
Other values (15) | 40275 | 25.1%

Interactions

[Figures: 49 pairwise interaction plots (7 × 7 numeric variables)]

Correlations

[Figure: correlation matrix plot]

Variable | Area_Type | District Name | Original_Storage_Capacity | Present_Storage_Capacity | Reason_for_Water_Body_Use | Renovation_Year | Repair_Renovation_Status | Scheme_Status_Reason | Water_Body_Nature | Water_Body_Status | Water_Body_Type | construcion_year | construction_cost | filled_up_storage_name | filled_up_storage_space_name | no_people_benefited_by_water_body | reason_water_body_in_use_name2 | reason_water_body_in_use_name3 | renovation_cost
Area_Type | 1.000 | 0.147 | 0.049 | 0.047 | 0.137 | 0.027 | 0.026 | 0.037 | 0.052 | 0.016 | 0.020 | -0.052 | -0.027 | 0.026 | 0.024 | 0.062 | 0.075 | 0.098 | 0.082
District Name | 0.147 | 1.000 | 0.166 | 0.167 | 0.167 | -0.062 | 0.230 | 0.086 | 0.293 | 0.130 | 0.116 | 0.122 | 0.064 | 0.170 | 0.200 | 0.026 | 0.137 | 0.176 | 0.140
Original_Storage_Capacity | 0.049 | 0.166 | 1.000 | 0.977 | 0.011 | -0.033 | 0.000 | 0.000 | 0.004 | 0.000 | 0.178 | -0.065 | 0.202 | 0.000 | 0.000 | 0.245 | 0.000 | 0.000 | 0.274
Present_Storage_Capacity | 0.047 | 0.167 | 0.977 | 1.000 | 0.016 | -0.024 | 0.000 | 0.000 | 0.000 | 0.000 | 0.149 | -0.046 | 0.210 | 0.000 | 0.000 | 0.225 | 0.000 | 0.000 | 0.271
Reason_for_Water_Body_Use | 0.137 | 0.167 | 0.011 | 0.016 | 1.000 | -0.044 | 0.073 | 0.378 | 0.039 | 1.000 | 0.062 | -0.063 | -0.047 | 0.068 | 0.057 | 0.222 | 0.227 | 0.206 | -0.006
Renovation_Year | 0.027 | -0.062 | -0.033 | -0.024 | -0.044 | 1.000 | 0.042 | 0.079 | 0.050 | 0.130 | 0.036 | 0.294 | 0.192 | 0.045 | 0.043 | -0.134 | 0.036 | 0.061 | 0.045
Repair_Renovation_Status | 0.026 | 0.230 | 0.000 | 0.000 | 0.073 | 0.042 | 1.000 | 0.018 | 0.010 | 0.019 | 0.042 | 0.098 | 0.085 | 0.009 | 0.029 | 0.008 | 0.029 | 0.024 | 0.065
Scheme_Status_Reason | 0.037 | 0.086 | 0.000 | 0.000 | 0.378 | 0.079 | 0.018 | 1.000 | 0.025 | 1.000 | 0.039 | -0.046 | -0.048 | 0.088 | 0.071 | 0.328 | 1.000 | 1.000 | -0.072
Water_Body_Nature | 0.052 | 0.293 | 0.004 | 0.000 | 0.039 | 0.050 | 0.010 | 0.025 | 1.000 | 0.012 | 0.035 | 0.033 | 0.033 | 0.052 | 0.050 | 0.037 | 0.054 | 0.044 | 0.008
Water_Body_Status | 0.016 | 0.130 | 0.000 | 0.000 | 1.000 | 0.130 | 0.019 | 1.000 | 0.012 | 1.000 | 0.046 | 0.077 | 0.041 | 0.123 | 0.073 | -0.632 | 1.000 | 1.000 | 0.119
Water_Body_Type | 0.020 | 0.116 | 0.178 | 0.149 | 0.062 | 0.036 | 0.042 | 0.039 | 0.035 | 0.046 | 1.000 | 0.119 | 0.284 | 0.111 | 0.138 | 0.125 | 0.047 | 0.041 | 0.099
construcion_year | -0.052 | 0.122 | -0.065 | -0.046 | -0.063 | 0.294 | 0.098 | -0.046 | 0.033 | 0.077 | 0.119 | 1.000 | 0.522 | 0.040 | 0.037 | -0.105 | 0.062 | 0.045 | -0.099
construction_cost | -0.027 | 0.064 | 0.202 | 0.210 | -0.047 | 0.192 | 0.085 | -0.048 | 0.033 | 0.041 | 0.284 | 0.522 | 1.000 | 0.000 | 0.000 | 0.127 | 0.000 | 0.000 | 0.159
filled_up_storage_name | 0.026 | 0.170 | 0.000 | 0.000 | 0.068 | 0.045 | 0.009 | 0.088 | 0.052 | 0.123 | 0.111 | 0.040 | 0.000 | 1.000 | 0.452 | -0.000 | 0.037 | 0.067 | 0.103
filled_up_storage_space_name | 0.024 | 0.200 | 0.000 | 0.000 | 0.057 | 0.043 | 0.029 | 0.071 | 0.050 | 0.073 | 0.138 | 0.037 | 0.000 | 0.452 | 1.000 | -0.005 | 0.029 | 0.067 | 0.040
no_people_benefited_by_water_body | 0.062 | 0.026 | 0.245 | 0.225 | 0.222 | -0.134 | 0.008 | 0.328 | 0.037 | -0.632 | 0.125 | -0.105 | 0.127 | -0.000 | -0.005 | 1.000 | 0.000 | 0.000 | 0.162
reason_water_body_in_use_name2 | 0.075 | 0.137 | 0.000 | 0.000 | 0.227 | 0.036 | 0.029 | 1.000 | 0.054 | 1.000 | 0.047 | 0.062 | 0.000 | 0.037 | 0.029 | 0.000 | 1.000 | 0.286 | 0.033
reason_water_body_in_use_name3 | 0.098 | 0.176 | 0.000 | 0.000 | 0.206 | 0.061 | 0.024 | 1.000 | 0.044 | 1.000 | 0.041 | 0.045 | 0.000 | 0.067 | 0.067 | 0.000 | 0.286 | 1.000 | 0.048
renovation_cost | 0.082 | 0.140 | 0.274 | 0.271 | -0.006 | 0.045 | 0.065 | -0.072 | 0.008 | 0.119 | 0.099 | -0.099 | 0.159 | 0.103 | 0.040 | 0.162 | 0.033 | 0.048 | 1.000
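
To pick the notable pairs out of the matrix, one can filter on the absolute coefficient. The sketch below uses a handful of coefficients copied from the matrix and an assumed cut-off of 0.5. That cut-off is not stated anywhere in this report; it is chosen here only because it is consistent with the High correlation alerts (0.522 is flagged, 0.452 is not).

```python
# A few coefficients copied from the matrix; the full matrix has 19 variables.
corr = {
    ("Original_Storage_Capacity", "Present_Storage_Capacity"): 0.977,
    ("construcion_year", "construction_cost"): 0.522,
    ("Water_Body_Status", "no_people_benefited_by_water_body"): -0.632,
    ("filled_up_storage_name", "filled_up_storage_space_name"): 0.452,
    ("Area_Type", "District Name"): 0.147,
}

THRESHOLD = 0.5  # assumed cut-off, not taken from the report

# Keep pairs whose absolute coefficient reaches the threshold,
# so strong negative relationships are included as well.
high = {pair: r for pair, r in corr.items() if abs(r) >= THRESHOLD}

for (a, b), r in sorted(high.items(), key=lambda item: -abs(item[1])):
    print(f"{a} ~ {b}: {r:+.3f}")
```

Filtering on the absolute value matters here because the relationship between Water_Body_Status and no_people_benefited_by_water_body is negative (-0.632).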

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]
[Figure: Correlation heatmap of nullity, showing how strongly the presence or absence of one variable affects the presence of another.]
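
Nullity correlation can be sketched as the Pearson correlation between the missing-value indicators of two columns. The helper `nullity_correlation` and the five-row sample below are hypothetical and use `None` for a missing value; they illustrate the idea and are not the report's own implementation. In this dataset construcion_year and construction_cost are each missing in 22811 rows, and Renovation_Year and renovation_cost in 41682 rows, which suggests (but does not by itself prove) that each pair is missing together.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns.

    Raises ZeroDivisionError if either column has no missing values or is
    entirely missing, because its indicator then has zero variance.
    """
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical five-row sample; None marks a missing value.
construcion_year = [1989.0, None, None, 1970.0, None]
construction_cost = [25000.0, None, None, 10000.0, None]
renovation_year = [2016.0, 2015.0, None, 2015.0, 2014.0]

same_pattern = nullity_correlation(construcion_year, construction_cost)
partial = nullity_correlation(construcion_year, renovation_year)
print(round(same_pattern, 3), round(partial, 3))
```

Two columns that are always missing in the same rows score 1, while columns whose gaps only partly overlap score somewhere in between.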

Sample

Index | Area_Type | State Name | District Name | Water_Body_Type | Water_Body_Status | Reason_for_Water_Body_Use | Scheme_Status_Reason | Water_Body_Nature | construcion_year | construction_cost | Renovation_Year | renovation_cost | Repair_Renovation_Status | Original_Storage_Capacity | Present_Storage_Capacity | filled_up_storage_name | filled_up_storage_space_name | no_people_benefited_by_water_body | reason_water_body_in_use_name2 | reason_water_body_in_use_name3
0 | Rural | KERALA | Kollam | Ponds | Yes | Domestic/Drinking | No reported problems | Natural | 1989.0 | 25000.0 | 2016.0 | 6000.0 | No | 60.0 | 50.0 | Full | Filled up every year | 1 | Ground water recharge | Pisciculture
1 | Rural | KERALA | Kollam | Ponds | Yes | Domestic/Drinking | No reported problems | Natural | NaN | NaN | 2015.0 | 5000.0 | No | 60.0 | 50.0 | Full | Filled up every year | 13 | Ground water recharge | Pisciculture
2 | Rural | KERALA | Kollam | Ponds | No | Not Specified | Siltation | Natural | NaN | NaN | 2014.0 | 5700.0 | No | 240.0 | 150.0 | Full | Filled up every year | 681 | NaN | NaN
3 | Rural | KERALA | Kollam | Ponds | Yes | Domestic/Drinking | No reported problems | Natural | NaN | NaN | 2015.0 | 5000.0 | No | 30.0 | 26.0 | Full | Filled up every year | 12 | Ground water recharge | Pisciculture
4 | Rural | KERALA | Palakkad | Ponds | No | Not Specified | Others | Natural | NaN | NaN | 2002.0 | 80000.0 | No | 2830.0 | 2000.0 | Upto 3/4 | Usually filled up | 681 | NaN | NaN
5 | Rural | KERALA | Kollam | Ponds | No | Not Specified | Siltation | Natural | NaN | NaN | 2017.0 | 5000.0 | No | 240.0 | 180.0 | Full | Filled up every year | 681 | NaN | NaN
6 | Rural | KERALA | Kollam | Ponds | No | Not Specified | Siltation | Natural | NaN | NaN | 2015.0 | 5000.0 | No | 120.0 | 100.0 | Upto 3/4 | Filled up every year | 681 | NaN | NaN
7 | Rural | KERALA | Kollam | Ponds | Yes | Domestic/Drinking | No reported problems | Natural | NaN | NaN | 2016.0 | 6000.0 | No | 60.0 | 50.0 | Full | Filled up every year | 1 | Ground water recharge | Pisciculture
8 | Rural | KERALA | Kollam | Ponds | Yes | Domestic/Drinking | No reported problems | Natural | NaN | NaN | 2014.0 | 3000.0 | No | 7500.0 | 5250.0 | Full | Filled up every year | 40 | Ground water recharge | Pisciculture
9 | Rural | KERALA | Kollam | Ponds | Yes | Domestic/Drinking | No reported problems | Natural | NaN | NaN | 2010.0 | 4000.0 | No | 60.0 | 50.0 | Full | Filled up every year | 13 | Ground water recharge | Pisciculture

Index | Area_Type | State Name | District Name | Water_Body_Type | Water_Body_Status | Reason_for_Water_Body_Use | Scheme_Status_Reason | Water_Body_Nature | construcion_year | construction_cost | Renovation_Year | renovation_cost | Repair_Renovation_Status | Original_Storage_Capacity | Present_Storage_Capacity | filled_up_storage_name | filled_up_storage_space_name | no_people_benefited_by_water_body | reason_water_body_in_use_name2 | reason_water_body_in_use_name3
55724 | Rural | KERALA | Kasargod | Ponds | Yes | Irrigation | No reported problems | Man-made | 1970.0 | 10000.0 | 2015.0 | 10000.0 | No | 4320.000000 | 4320.000000 | Full | Filled up every year | 15 | Ground water recharge | NaN
55725 | Rural | KERALA | Kozhikode | Water consv schemes/percolation tanks/check-dams | Yes | Ground water recharge | No reported problems | Man-made | NaN | NaN | 2012.0 | 1000000.0 | No | 172467.460633 | 142922.561496 | Full | Filled up every year | 12 | NaN | NaN
55726 | Rural | KERALA | Kozhikode | Ponds | Yes | Ground water recharge | No reported problems | Man-made | 1991.0 | 40000.0 | 2005.0 | 10000.0 | No | 144.000000 | 48.000000 | Upto 1/4 | Rarely filled up | 6 | NaN | NaN
55727 | Rural | KERALA | Kozhikode | Ponds | Yes | Religious | No reported problems | Man-made | 1983.0 | 20000.0 | 2004.0 | 40000.0 | No | 160.000000 | 100.000000 | Upto 3/4 | Usually filled up | 25 | Ground water recharge | NaN
55728 | Rural | KERALA | Kozhikode | Ponds | Yes | Irrigation | No reported problems | Man-made | 1981.0 | 10000.0 | 2017.0 | 5000.0 | No | 64.000000 | 32.000000 | Upto 1/2 | Rarely filled up | 10 | Ground water recharge | NaN
55729 | Rural | KERALA | Kozhikode | Ponds | Yes | Ground water recharge | No reported problems | Man-made | 1969.0 | 2000.0 | 2012.0 | 5000.0 | No | 1200.000000 | 600.000000 | Upto 1/2 | Rarely filled up | 100 | NaN | NaN
55730 | Rural | KERALA | Kozhikode | Ponds | Yes | Pisciculture | No reported problems | Man-made | 1988.0 | 10000.0 | 2006.0 | 6000.0 | No | 540.000000 | 450.000000 | Upto 3/4 | Usually filled up | 10 | Ground water recharge | NaN
55731 | Rural | KERALA | Idukki | Ponds | Yes | Irrigation | No reported problems | Man-made | 2003.0 | 70000.0 | 2017.0 | 10000.0 | No | 210.000000 | 210.000000 | Full | Filled up every year | 10 | Domestic/Drinking | NaN
55732 | Urban | KERALA | Kollam | Ponds | No | Not Specified | Others | Man-made | NaN | NaN | NaN | NaN | No | 20.000000 | 20.000000 | Upto 3/4 | Rarely filled up | 681 | NaN | NaN
55733 | Rural | KERALA | Malappuram | Ponds | Yes | Domestic/Drinking | No reported problems | Man-made | 1990.0 | 20000.0 | NaN | NaN | No | 1400.000000 | 1125.000000 | Full | Filled up every year | 100 | NaN | NaN

Duplicate rows

Most frequently occurring

Index | Area_Type | State Name | District Name | Water_Body_Type | Water_Body_Status | Reason_for_Water_Body_Use | Scheme_Status_Reason | Water_Body_Nature | construcion_year | construction_cost | Renovation_Year | renovation_cost | Repair_Renovation_Status | Original_Storage_Capacity | Present_Storage_Capacity | filled_up_storage_name | filled_up_storage_space_name | no_people_benefited_by_water_body | reason_water_body_in_use_name2 | reason_water_body_in_use_name3 | # duplicates
99 | Rural | KERALA | Idukki | Water consv schemes/percolation tanks/check-dams | Yes | Domestic/Drinking | No reported problems | Man-made | NaN | NaN | NaN | NaN | No | 172467.460633 | 142922.561496 | Full | Filled up every year | 10 | NaN | NaN | 20
469 | Rural | KERALA | Thrissur | Ponds | Yes | Domestic/Drinking | No reported problems | Natural | NaN | NaN | NaN | NaN | No | 100.000000 | 90.000000 | Full | Filled up every year | 10 | NaN | NaN | 12
355 | Rural | KERALA | Palakkad | Ponds | Yes | Irrigation | No reported problems | Natural | NaN | NaN | NaN | NaN | No | 150.000000 | 150.000000 | Upto 3/4 | Usually filled up | 15 | NaN | NaN | 11
313 | Rural | KERALA | Palakkad | Ponds | No | Not Specified | Siltation | Natural | NaN | NaN | NaN | NaN | No | 300.000000 | 200.000000 | Upto 3/4 | Usually filled up | 681 | NaN | NaN | 10
247 | Rural | KERALA | Malappuram | Ponds | Yes | Other | No reported problems | Man-made | NaN | NaN | NaN | NaN | No | 200.000000 | 150.000000 | Full | Filled up every year | 1 | Ground water recharge | Irrigation | 8
292 | Rural | KERALA | Palakkad | Ponds | No | Not Specified | Others | Natural | NaN | NaN | NaN | NaN | No | 200.000000 | 150.000000 | Full | Filled up every year | 681 | NaN | NaN | 8
359 | Rural | KERALA | Palakkad | Ponds | Yes | Irrigation | No reported problems | Natural | NaN | NaN | NaN | NaN | No | 200.000000 | 150.000000 | Full | Filled up every year | 1 | Other | NaN | 8
9 | Rural | KERALA | Alappuzha | Ponds | No | Not Specified | Siltation | Man-made | NaN | NaN | NaN | NaN | No | 60.000000 | 50.000000 | Full | Filled up every year | 681 | NaN | NaN | 7
17 | Rural | KERALA | Alappuzha | Ponds | No | Not Specified | Siltation | Man-made | NaN | NaN | NaN | NaN | No | 200.000000 | 150.000000 | Upto 1/2 | Usually filled up | 681 | NaN | NaN | 7
217 | Rural | KERALA | Kozhikode | Water consv schemes/percolation tanks/check-dams | No | Not Specified | Others | Man-made | NaN | NaN | NaN | NaN | No | 172467.460633 | 142922.561496 | Full | Filled up every year | 681 | NaN | NaN | 7
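
A table like this can be produced by grouping identical rows and counting them. The sketch below is an illustration with a hypothetical four-column sample, not the report's own code. Missing values are represented as `None` rather than a float NaN, because NaN does not compare equal to itself and rows containing it would not reliably group together.

```python
from collections import Counter


def duplicate_groups(rows):
    """Return {row: occurrences} for every row that appears more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# Hypothetical sample: (district, type, original capacity, renovation year).
rows = [
    ("Palakkad", "Ponds", 200.0, None),
    ("Palakkad", "Ponds", 200.0, None),
    ("Kollam", "Ponds", 60.0, 2016.0),
    ("Palakkad", "Ponds", 200.0, None),
    ("Alappuzha", "Ponds", 60.0, None),
    ("Alappuzha", "Ponds", 60.0, None),
]

groups = duplicate_groups(rows)
# Most frequent duplicate groups first.
for row, n in sorted(groups.items(), key=lambda item: -item[1]):
    print(n, row)
```

Many of the duplicated rows listed above have no construction or renovation data, so rows that differ only in unrecorded details may be indistinguishable in this dataset.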